Abstract



In this paper, we propose a fast 2-D block-based motion estimation algorithm called Particle Swarm
Optimization - Zero-motion Prejudgment (PSO-ZMP), which consists of three sequential routines: 1) zero-motion
prejudgment, which finds static macroblocks (MBs) that need no further search and thus reduces the
computational cost; 2) predictive image coding; and 3) PSO matching. Simulation results show that the
proposed PSO-ZMP algorithm requires roughly 8 to 12 times less computation than Diamond Search (DS) and
roughly 3.6 to 5.6 times less than the recently proposed Adaptive Rood Pattern Search (ARPS). The PSNR of
PSO-ZMP is very close to that of DS and ARPS on sequences with little motion. On sequences containing dense
and complex motion, the PSNR of PSO-ZMP is up to about 4 dB lower than that of DS and ARPS, which is still
acceptable.



1 Introduction 



With the increasing popularity of technologies such as digital television, Internet streaming video and video
conferencing, video compression has become an essential component of broadcast and entertainment media.
Among the various approaches, block-based motion estimation and compression are the most widely accepted.
The block-matching algorithm (BMA) for motion estimation (ME) has been adopted in many international
standards for digital video compression, such as H.264 and MPEG-4 [8]. In the framework of video coding,
statistical redundancies can be categorized as either temporal or spatial. Motion estimation is applied to
reduce the temporal redundancies among frames [4]. Block-based matching algorithms regard each frame of the
video sequence as a set of non-overlapping small regions called macroblocks (MBs), which are usually square
and of fixed size (16 x 16 or 8 x 8). Let $B_m$ denote the $m$th MB, $M$ the number of blocks, and
$\mathcal{M} = \{1, 2, \ldots, M\}$; let $\Lambda$ be the entire frame. The partition into MBs should satisfy
$\bigcup_{m \in \mathcal{M}} B_m = \Lambda$ and $B_m \cap B_n = \emptyset$ for $m \neq n$ [14]. Given an MB
$B_m$ in the anchor frame, the motion estimation problem is to determine a corresponding matching MB $B'_m$
in the target frame such that the matching error between the two blocks is minimized. A motion vector is then
computed by subtracting the coordinates of the MB in the anchor frame from those of the matching MB in the
target frame. Instead of sending the entire frame pixel by pixel, a set of motion vectors is transmitted
through the channel, which greatly reduces the amount of transmitted data. At the decoder, a motion
compensation procedure reconstructs the frames from the received motion vectors and the anchor frame.
According to the literature, motion estimation and encoding consume nearly 70-90 percent of the total
computation of the whole video compression procedure, which has made it an active research topic over the
last two decades.

Many BMAs have been proposed in the literature. The most basic one is Exhaustive Search (ES), also known as
full search, which simply compares the given MB in the anchor frame with all candidate MBs in the target
frame within a predefined search region. Previous research has shown that ES achieves high matching accuracy
but requires a very large amount of computation, which makes it infeasible for real-time video applications.
To speed up the search, various fast block-matching algorithms that reduce the number of search candidates
have been developed. Well-known examples are 2-D Logarithmic Search (LOGS) [6], Three Step Search (TSS) [10],
Four Step Search (4SS) [7], Diamond Search (DS) [9], which is accepted in the MPEG-4 Verification Model and
widely implemented in VLSI, and the recently proposed Adaptive Rood Pattern Search (ARPS) [13], which is
almost two to three times faster than DS and even achieves a higher peak signal-to-noise ratio (PSNR) than DS.

From the optimization point of view, block-based methods can be described by the following minimization [1],
$\forall m$:

$$\min_{d_m \in V} \varepsilon(d_m), \qquad \varepsilon(d_m) = \sum_{n \in B_m} \Phi\left(I_k[n] - I_{k-1}[n + d_m]\right)$$

where $I_k$ is the target frame; $I_{k-1}$ is the anchor frame; $\varepsilon(d_m)$ is the matching error; $d_m$
are the motion vectors; and $V$ is the search area to which $d_m$ belongs, defined as
$V = \{n = (n_1, n_2) : -P \le n_1 \le P, -P \le n_2 \le P\}$. The sign of $d$ is positive when the block moves
in the positive direction from the $(k-1)$th frame to the $k$th frame, and negative when the block moves in
the negative direction. $B_m$ is an $N \times N$ MB with its top-left corner at $m = (m_1, m_2)$. The goal is
to find the best displacement motion vector $d_m$ for each MB $B_m$ in the sense of the criterion $\Phi$.
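As a minimal sketch of this minimization, the following Python code evaluates the matching error for every displacement in the search area and keeps the smallest one, which is what an exhaustive search does. It assumes that $\Phi$ is the absolute value, that frames are lists of rows of grey levels, and that displaced blocks falling outside the frame are skipped; the function and parameter names are illustrative and do not come from the paper.

```python
def matching_error(frame_k, frame_prev, top, left, d, n):
    """Sum of |I_k[n] - I_{k-1}[n + d]| over the n x n block at (top, left)."""
    dy, dx = d
    return sum(
        abs(frame_k[top + i][left + j] - frame_prev[top + dy + i][left + dx + j])
        for i in range(n)
        for j in range(n)
    )


def exhaustive_search(frame_k, frame_prev, top, left, n, p):
    """Return (best displacement, error) over all |dy| <= p and |dx| <= p."""
    rows, cols = len(frame_prev), len(frame_prev[0])
    best_d, best_err = None, None
    for dy in range(-p, p + 1):
        for dx in range(-p, p + 1):
            # Assumption: candidates that leave the frame are not evaluated.
            if not (0 <= top + dy <= rows - n and 0 <= left + dx <= cols - n):
                continue
            err = matching_error(frame_k, frame_prev, top, left, (dy, dx), n)
            if best_err is None or err < best_err:
                best_d, best_err = (dy, dx), err
    return best_d, best_err
```

With a search range of $P$, this loop evaluates up to $(2P + 1)^2$ candidates per MB, which is the cost that the fast algorithms discussed above try to avoid.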

Particle swarm optimization (PSO) was originally proposed by Kennedy and Eberhart in 1995 [5]. It has
attracted wide attention from researchers because of its grounding in swarm intelligence and its simple
algorithmic structure. PSO has been applied successfully in a wide range of research areas, such as function
optimization, pattern recognition, neural network training and fuzzy system control. Like the Genetic
Algorithm (GA), PSO is an evolutionary algorithm; it is based on swarm intelligence. Unlike GA, however, PSO
has no evolution operators such as crossover and mutation [3]. In PSO, the potential solutions, called
particles, fly through the solution space by following the current optimum particles. The original intent
was to graphically simulate the graceful but unpredictable choreography of a bird flock. Through competition
and cooperation, particles follow the optimum points in the solution space to optimize the problem. Many
studies indicate that PSO is relatively more capable of global exploration and converges more quickly than
many other heuristic algorithms [2].

The rest of the paper is organized as follows. Section 2 introduces the PSO algorithm, and Section 3 proposes
the PSO-ZMP block-matching algorithm for motion estimation. Simulation results and analysis on five video
sequences are given in Section 4. Section 5 concludes the paper.

2 Particle Swarm Optimization 

Particle swarm algorithm is a kind of evolutionary algorithm based on swarm intelligence. Each potential solution 
is considered as one particle, and these particles are distributed stochastically in the high-dimensional solution 
space in the initialization period of the algorithm. Through following the optimum discovered by itself and the 
entire group, each particle periodically updates its own velocity and position. 

$$v_{id}(t+1) = w \times v_{id}(t) + c_1 \times rand_1(\cdot) \times (p_{id} - x_{id}) + c_2 \times rand_2(\cdot) \times (p_{gd} - x_{id}) \tag{1}$$

$$x_{id}(t+1) = x_{id}(t) + v_{id}(t+1) \tag{2}$$

$$1 \le i \le N, \quad 1 \le d \le D$$

where $N$ is the number of particles and $D$ is the dimensionality; $V_i = (v_{i1}, v_{i2}, \ldots, v_{iD})$,
$v_{id} \in [-v_{max}, v_{max}]$, is the velocity vector of particle $i$, which decides the particle's
displacement in each iteration. Similarly, $X_i = (x_{i1}, x_{i2}, \ldots, x_{iD})$,
$x_{id} \in [-x_{max}, x_{max}]$, is the position vector of particle $i$, which is a potential solution in the
solution space; the quality of the solution is measured by a fitness function. $w$ is the inertia weight,
which decreases linearly during a run; $c_1$ and $c_2$ are positive constants called the acceleration
factors, which are generally set to 2.0; $rand_1(\cdot)$ and $rand_2(\cdot)$ are two independent random
numbers distributed uniformly over the range [0, 1]; and $p_g$ and $p_i$ are the best solutions discovered so
far by the group and by particle $i$ itself, respectively.

In iteration $t + 1$, particle $i$ uses $p_g$ and $p_i$ as heuristic information to update its own velocity
and position. The first term in Eq. (1) represents diversification, while the second and third terms
represent intensification. The second and third terms can be understood as the particle's trust in itself and
in the entire social system, respectively. A balance between diversification and intensification is thereby
achieved, which makes the optimization progress possible.
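The update in Eq. (1) and Eq. (2) can be sketched for a single particle as follows. This is an illustrative Python version, not the authors' implementation: the function name, the list-based representation and the explicit clamping of the velocity to $[-v_{max}, v_{max}]$ are assumptions made for the example.

```python
import random


def pso_step(x, v, p_best, g_best, w, c1=2.0, c2=2.0, v_max=5.0, rng=random):
    """Apply Eq. (1) and Eq. (2) once to a single particle.

    x, v, p_best and g_best are lists of length D; returns the new (x, v).
    """
    new_x, new_v = [], []
    for d in range(len(x)):
        r1, r2 = rng.random(), rng.random()    # independent uniform random numbers
        vd = (w * v[d]
              + c1 * r1 * (p_best[d] - x[d])   # pull towards the particle's own best
              + c2 * r2 * (g_best[d] - x[d]))  # pull towards the group best
        vd = max(-v_max, min(v_max, vd))       # assumed clamp to [-v_max, v_max]
        new_v.append(vd)
        new_x.append(x[d] + vd)
    return new_x, new_v
```

A particle that already sits on both its own best and the group best, with zero velocity, stays where it is; any movement comes from the inertia term or from the distance to the two best positions.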

3 Block-matching algorithm based on PSO-ZMP 

In this paper, an algorithm based on Particle Swarm Optimization (PSO) and Zero-Motion Prejudgment (ZMP)
is proposed to reduce the computation while obtaining satisfactory compensated video quality. The PSO-ZMP
algorithm consists of three sequential routines: 1) zero-motion prejudgment; 2) predictive image coding;
3) PSO matching. Instead of distributing the particles stochastically over the entire matching space, we also
devise a novel distribution pattern for particle initialization that exploits the center-biased
characteristics of common motion fields.

3.1 Performance Evaluation Criterion 

As is widely done, we measure the amount of computation and the quality of the compensated video sequence by
Computation and Peak Signal-to-Noise Ratio (PSNR). Computation is defined as the average number of error
function evaluations per MV generated. Because of its low computational cost, we choose the Summed Absolute
Difference (SAD) as the error function, which is defined as follows:

$$\mathrm{SAD} = \sum_{i=1}^{N} \sum_{j=1}^{N} \left| I_k(i,j) - I_{k-1}(i,j) \right| \tag{3}$$

where the size of a MB is N x N. 
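Eq. (3) translates directly into code. The sketch below assumes that the two blocks have already been extracted from the frames as equally sized lists of rows; the function name is illustrative.

```python
def sad(block_a, block_b):
    """Summed absolute difference between two equally sized blocks, as in Eq. (3)."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    )
```

SAD needs only one subtraction, one absolute value and one addition per pixel, with no multiplication, which is consistent with its choice here as a cheap error function.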

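The motion estimation quality between the original sequence $I_{ogn}$ and the compensated sequence $I_{cmp}$
is measured in PSNR, which is defined as:

$$\mathrm{PSNR} = 10 \log_{10} \frac{I^2}{\sigma_e^2}$$

$$\sigma_e^2 = \mathrm{MSE} = \frac{1}{K N^2} \sum_{k=0}^{K} \sum_{i=0}^{N} \sum_{j=0}^{N} \left( I_{ogn}(i,j,k) - I_{cmp}(i,j,k) \right)^2$$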

where K is the number of frames in the video sequence. 
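A minimal Python sketch of this quality measure is given below. It assumes 8-bit video, so the peak intensity defaults to 255, and it averages the squared error over every pixel of every frame; both choices, like the function name, are assumptions for illustration.

```python
import math


def psnr(original, compensated, peak=255.0):
    """PSNR in dB between two sequences given as lists of frames (lists of rows)."""
    squared_error, count = 0.0, 0
    for frame_o, frame_c in zip(original, compensated):
        for row_o, row_c in zip(frame_o, frame_c):
            for o, c in zip(row_o, row_c):
                squared_error += (o - c) ** 2
                count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical sequences have no error
    return 10.0 * math.log10(peak ** 2 / mse)
```

Because the scale is logarithmic, halving the mean squared error raises the PSNR by about 3 dB.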

3.2 Zero-Motion Prejudgment 

Zero-Motion Prejudgment (ZMP) was first introduced in [13]. The data reported in [13] show that in most test
sequences more than 70% of the MBs are static and need no further search. A significant reduction in
computation is therefore possible if the ZMP procedure is performed before the subsequent predictive coding
and PSO matching routines. We first calculate the matching error (SAD in this paper) between the MB in the
anchor frame and the MB at the same location in the target frame, and then compare it with a predetermined
threshold $\Delta$. If the matching error is smaller than $\Delta$, we consider the MB static, skip any
further motion estimation for it, and return [0, 0] as its motion vector (MV).
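The prejudgment amounts to a single SAD evaluation followed by a comparison. The following Python sketch assumes that the two co-located blocks are passed in as lists of rows and that `None` signals that further search is needed; the function name and the return convention are illustrative.

```python
def zero_motion_prejudgment(block_anchor, block_target, threshold):
    """Return [0, 0] if the co-located blocks differ by less than threshold, else None."""
    cost = sum(
        abs(a - b)
        for row_a, row_b in zip(block_anchor, block_target)
        for a, b in zip(row_a, row_b)
    )
    # Strictly smaller than the threshold, as stated in the text.
    return [0, 0] if cost < threshold else None
```

A static MB therefore costs exactly one error function evaluation, which is where most of the computational saving comes from when the majority of MBs are static.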

3.3 Predictive Image Coding 

Because of the center-biased characteristics of video sequences, that is, certain MBs are highly correlated in
local regions of the frame, the encoder creates a prediction of a region of the current frame based on
previously encoded and transmitted frames.

If the frame is processed in raster order, four patterns of region of support (ROS) are available for the
currently encoded MB, as shown in Fig. 1. Each ROS consists of the neighboring blocks whose MVs are used to
compute the predicted MV, and is kept small to limit the computational cost. The experiments reported in [13]
show that there is little PSNR difference between these four ROS patterns in the predictive coding routine,
and that ROS type D consumes the least computation because it has the simplest structure. ROS pattern D is
therefore adopted in this paper.
Figure 1: Four types of ROS for the current-encoded MB. (The block marked "O" is the current-encoded MB;
blocks in grey are the reference MBs for prediction.)

Figure 2: Patterns for initializing particles



3.4 Selection of Search Patterns 

Because of the spatial correlation between MBs within a frame, during the initialization of the PSO matching
routine we distribute the particles in four specific patterns (Fig. 2), with a view to reducing the
computational cost while achieving higher PSNR.

Since frames are processed in raster order, the MB in the top-left corner of the frame cannot be predictively
coded, because there is no reference MB for prediction in the frame currently being encoded. In this case, we
simply skip the predictive coding and begin the PSO searching routine directly, with the initial positions of
the particles in pattern type B in Fig. 2.

For MBs located in the leftmost column of a frame, the reference MBs used in predictive coding lie on the
other side of the frame; they may therefore not be highly correlated and are inefficient for prediction. In
this case we also perform only the PSO searching routine, with pattern type D in Fig. 2. For the last MB
processed in the leftmost column, that is, the MB in the bottom-left corner, we use pattern type C in Fig. 2
instead.

Otherwise, pattern type A in Fig. 2 is adopted. We put four particles in a rood shape of size zero (size
refers to the distance between any vertex point and the center point) in the adjacent MBs and four particles
in a rood shape of size one, and then rotate it by an angle of $\pi/2$. With two rood shapes of different
sizes, we try to balance global exploration and local refined search, in order to obtain a broader search
space as well as higher matching accuracy. Moreover, we distribute the particles equally in all directions
(8 particles in 8 directions) so that, under stochastic conditions, the matching MB can be found in each
direction with equal probability.
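The choice among the four initialization patterns depends only on where the MB lies in the frame. A possible Python sketch of this rule is shown below; it assumes zero-based MB row and column indices in raster order, and the function name is hypothetical. The geometry of the patterns themselves is given in Fig. 2 and is not reproduced here.

```python
def select_pattern(row, col, n_rows):
    """Pick the particle initialization pattern for the MB at (row, col)."""
    if col != 0:
        return "A"  # general case: predictive coding is applied as well
    if row == 0:
        return "B"  # top-left MB: no reference MB is available
    if row == n_rows - 1:
        return "C"  # bottom-left MB
    return "D"      # remaining MBs of the leftmost column
```

For a QCIF frame (176 x 144 pixels) divided into 16 x 16 MBs there are 9 rows and 11 columns of MBs, so only 9 of the 99 MBs per frame use patterns B, C or D.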

Notably, if the position of a particle falls outside the boundary of the image frame during initialization or
during a PSO run, we simply put the particle at the position nearest to its intended position.
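One way to realize this rule is to clamp each coordinate independently, which gives the nearest valid position for a rectangular frame. The sketch below assumes that a particle position is the top-left corner of an $n \times n$ candidate block; this convention and the function name are assumptions.

```python
def clamp_position(y, x, frame_rows, frame_cols, n):
    """Move the top-left corner (y, x) of an n x n block back inside the frame."""
    y = max(0, min(frame_rows - n, y))
    x = max(0, min(frame_cols - n, x))
    return y, x
```

For example, in a 144 x 176 frame with 16 x 16 blocks, an intended position of (-3, 170) is moved to (0, 160), while a position already inside the frame is left unchanged.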

3.5 Stopping Criterion 

Generally, there are two widely adopted stopping criteria. One is fixed-iteration: given a certain number of
iterations, say $N$, the search stops after $N$ iterations. The other is specified-threshold. During a PSO
run, the best value found by the entire group, $p_g$, called the "best so far" value, is updated by the
particles. For minimization problems, we specify a very small threshold $\epsilon$; if the change of $p_g$
over $t$ iterations is smaller than the threshold, we consider the group best value to be very near the
global optimum, and the matching procedure stops. Because of the center-biased characteristics of real-world
motion fields, we adopt the fixed-iteration method in this paper to reduce the computational cost.
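The two criteria can be sketched in one small Python function. It assumes that the group best error is recorded after each completed iteration; the function name, the `history` list and the `window` parameter (playing the role of $t$ above) are illustrative choices.

```python
def should_stop(history, max_iterations, epsilon=None, window=None):
    """Decide whether a PSO run should stop.

    history holds the group best error after each completed iteration.
    """
    if len(history) >= max_iterations:  # fixed-iteration criterion
        return True
    if epsilon is not None and window is not None and len(history) > window:
        # Specified-threshold criterion: change of the best value over `window` iterations.
        return abs(history[-1 - window] - history[-1]) < epsilon
    return False
```

Calling it with only `max_iterations` reproduces the fixed-iteration rule adopted in this paper, whose cost per MB is bounded in advance.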

3.6 Summary of Our Method 

We combine the ZMP, predictive coding and PSO matching routines and propose a block-matching algorithm for
motion estimation based on PSO and ZMP. The algorithm is summarized in the pseudocode below:



Algorithm 1: PSO-ZMP BMA

for each frame i do
    for each MB j do
        zmpcost ← SAD(I_{i-1}(j), I_i(j))
        if zmpcost < Δ then
            Consider MB j static
            motionVect = [0, 0]
            Continue
        else
            if MB j is in the leftmost column of frame i then
                if MB j == 1 then
                    Initialize particles in pattern B, Fig. 2
                else if MB j is in the bottom-left corner of frame i then
                    Initialize particles in pattern C, Fig. 2
                else
                    Initialize particles in pattern D, Fig. 2
                end if
            else
                Initialize particles in pattern A, Fig. 2
                Predictive image coding routine
            end if
            Begin PSO matching routine
            for each iteration t do
                for each particle p do
                    Evaluate SAD using Eq. (3) and update P_g, P_p
                    Update velocity using Eq. (1)
                    Update position using Eq. (2)
                end for
            end for
        end if
    end for
    Calculate the motion vector and output
end for
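To make the control flow of Algorithm 1 concrete, the following Python sketch processes a single MB. It is a simplified illustration under several assumptions, not the authors' implementation: the initial particle offsets are passed in by the caller (the actual patterns are those of Fig. 2), the predictive coding routine is omitted, positions are rounded to integer pixels and clamped to the frame, and all function and parameter names are hypothetical.

```python
import random


def block_sad(cur, prev, top, left, y, x, n):
    """SAD between the n x n block of cur at (top, left) and of prev at (y, x)."""
    return sum(abs(cur[top + i][left + j] - prev[y + i][x + j])
               for i in range(n) for j in range(n))


def pso_zmp_block(cur, prev, top, left, n, threshold, offsets,
                  iterations=5, v_max=5.0, c1=2.0, c2=2.0, rng=random):
    """Return (motion vector, number of SAD evaluations) for one MB."""
    rows, cols = len(prev), len(prev[0])
    evaluations = 1
    if block_sad(cur, prev, top, left, top, left, n) < threshold:
        return (0, 0), evaluations  # zero-motion prejudgment: static MB

    def clamp(p):
        return [max(0, min(rows - n, p[0])), max(0, min(cols - n, p[1]))]

    def cost(p):
        return block_sad(cur, prev, top, left, int(round(p[0])), int(round(p[1])), n)

    # Particles start at the given offsets around the co-located block.
    xs = [clamp([top + dy, left + dx]) for dy, dx in offsets]
    vs = [[0.0, 0.0] for _ in xs]
    p_best = [list(x) for x in xs]
    p_cost = [cost(x) for x in xs]
    evaluations += len(xs)
    g = min(range(len(xs)), key=lambda i: p_cost[i])
    g_best, g_cost = list(p_best[g]), p_cost[g]

    for t in range(iterations):  # fixed-iteration stopping criterion
        w = 0.9 - 0.5 * t / max(1, iterations - 1)  # inertia weight from 0.9 to 0.4
        for i in range(len(xs)):
            for d in range(2):
                v = (w * vs[i][d]
                     + c1 * rng.random() * (p_best[i][d] - xs[i][d])
                     + c2 * rng.random() * (g_best[d] - xs[i][d]))
                vs[i][d] = max(-v_max, min(v_max, v))
                xs[i][d] += vs[i][d]
            xs[i] = clamp(xs[i])
            c = cost(xs[i])
            evaluations += 1
            if c < p_cost[i]:
                p_best[i], p_cost[i] = list(xs[i]), c
                if c < g_cost:
                    g_best, g_cost = list(xs[i]), c

    mv = (int(round(g_best[0])) - top, int(round(g_best[1])) - left)
    return mv, evaluations
```

Under these assumptions, a static MB costs one SAD evaluation, while a non-static MB with 8 particles and 5 iterations costs 1 + 8 + 5 x 8 = 49 evaluations, regardless of how large the motion is.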



4 Experiments and Results 

We tested our PSO-ZMP algorithm on five test video sequences: Akiyo, Container, Mother & Daughter, News
and Silent, using 100 frames of each (except Akiyo, for which 90 frames were used because of the limited
sequence length).

4.1 Experimental Settings 
4.1.1 PSO Parameters 

PSO matching is the core routine of our algorithm. In this paper, to balance computational cost against
compensated video quality, we adopt the standard PSO with inertia weight [11, 12], which is widely considered
the de facto PSO standard. We use the fixed-iteration stopping criterion with a maximum of 5 iterations. The
maximum velocity is set to 5. The inertia weight $w$ decreases linearly from 0.9 to 0.4 during a PSO run, and
the two acceleration factors $c_1$ and $c_2$ are set to 2.0, as is common practice.

Table 1: ZMP threshold $\Delta$ for five test video sequences

Sequence       Format   ZMP Threshold $\Delta$
Akiyo          QCIF     384
Container      QCIF     512
Mot. & Dau.    QCIF     384
News           QCIF     512
Silent         QCIF     384
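The linearly decreasing inertia weight used in the PSO settings can be sketched as follows. The exact schedule is not spelled out in the text, so the form below, which reaches 0.9 at the first iteration and 0.4 at the last one, is an assumption; the function name is illustrative.

```python
def inertia_weight(t, iterations=5, w_start=0.9, w_end=0.4):
    """Inertia weight for iteration t = 0, ..., iterations - 1."""
    if iterations == 1:
        return w_start
    return w_start - (w_start - w_end) * t / (iterations - 1)
```

With 5 iterations this gives approximately 0.9, 0.775, 0.65, 0.525 and 0.4, so early iterations favor exploration and later ones favor refinement around the best positions found.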

4.1.2 Motion Estimation Parameters 

• We divide each image frame into 16 x 16 MBs in the simulation.

• We select a ZMP threshold $\Delta$ for each test video sequence based on data obtained in experiments. The
thresholds are shown in Table 1.

• We do not rigidly restrict the range of candidate matching MBs by a search window $V$. Instead, through
the fixed number of iterations and the maximum velocity setting, particles search for the matching MB in a
more flexible and adaptable area.



4.2 Results and Analysis 

Fig. 3 and Fig. 4 show the simulation results on the five test video sequences. For comparison, the
performance of DS, ARPS, GA-ZMP (the BMA based on the genetic algorithm) and PSO-ZMP is examined. The average
peak signal-to-noise ratio (PSNR) per frame of the reconstructed video sequence is computed for quality
measurement and reported in Table 2. The computational gain of PSO-ZMP over DS (or ARPS) is defined as the
ratio of the computation of DS (or ARPS) to that of our method, and is shown in Table 3.

The results show that PSO-ZMP achieves significant computational reductions with acceptable drops in peak
signal-to-noise ratio (PSNR). Notably, on Akiyo and Mother & Daughter, our method achieves very close PSNR
performance (average PSNR within about 1.1 dB of DS and ARPS) with 12.04 and 12.44 times less computation than
DS, respectively, and 4.94 and 5.62 times less computation than ARPS. On Silent, News and Container, the
average PSNR of our method is about 1.3 to 4 dB lower than that of ARPS and DS (maximum difference 4.04 dB on
Silent). On those sequences, however, PSO-ZMP requires roughly 8 to 10 times less computation than DS and
3.6 to 4.3 times less than ARPS. According to [14], a PSNR higher than 40 dB typically indicates an excellent
image (i.e., very close to the original); a PSNR between 30 and 40 dB usually means a good image (i.e., the
distortion is visible but acceptable); a PSNR between 20 and 30 dB is quite poor; and a PSNR lower than 20 dB
is unacceptable. For all five sequences tested, the PSO-ZMP algorithm achieves a PSNR higher than 30 dB in
most frames, so the PSNR drops are acceptable.
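The PSNR gaps quoted above can be checked against the average values in Table 2. The short Python sketch below copies those averages and computes, for each sequence, the drop of PSO-ZMP relative to the better of DS and ARPS; the dictionary layout and the function name are illustrative.

```python
# Average PSNR in dB, copied from Table 2.
AVERAGE_PSNR = {
    "Akiyo":       {"DS": 43.50, "ARPS": 43.49, "GA-ZMP": 42.07, "PSO-ZMP": 42.39},
    "Container":   {"DS": 36.34, "ARPS": 36.13, "GA-ZMP": 32.36, "PSO-ZMP": 33.15},
    "Mot. & Dau.": {"DS": 40.46, "ARPS": 40.57, "GA-ZMP": 35.66, "PSO-ZMP": 39.48},
    "News":        {"DS": 36.66, "ARPS": 36.61, "GA-ZMP": 35.02, "PSO-ZMP": 35.29},
    "Silent":      {"DS": 36.68, "ARPS": 36.46, "GA-ZMP": 31.62, "PSO-ZMP": 32.64},
}


def psnr_drop(sequence):
    """Gap in dB between the better of DS and ARPS and PSO-ZMP."""
    row = AVERAGE_PSNR[sequence]
    return round(max(row["DS"], row["ARPS"]) - row["PSO-ZMP"], 2)
```

With these averages the drop ranges from 1.09 dB on Mother & Daughter to 4.04 dB on Silent, and the PSO-ZMP average stays above 30 dB on every sequence.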

Compared with GA-ZMP, which combines a genetic algorithm with zero-motion prejudgment (ZMP), our PSO-ZMP
algorithm achieves better average PSNR and lower computation on all five test sequences. Because of evolution
operators such as crossover and mutation, GA requires about 1.5 to 2.2 times more computation than PSO.
Meanwhile, our algorithm obtains a higher average PSNR than the GA-based one because PSO is more capable of
global exploration and local exploitation [3].

5 Conclusion 

In this paper, we have proposed a fast block-based motion estimation algorithm based on Particle Swarm
Optimization (PSO) with novel particle initialization patterns. Applied successfully to many function and
combinatorial optimization problems, PSO has been shown to have a relatively stronger ability in global
exploration. In addition, a zero-motion prejudgment (ZMP) routine is incorporated into the PSO BMA to further
reduce the computational cost of the algorithm. Simulation results show that the proposed PSO-ZMP BMA
requires less computation than the widely accepted ARPS and DS BMAs, with PSNR that is close to theirs on
some sequences and lower by an acceptable amount on the others. Moreover, PSO takes only a few lines of code
because of its simplicity, which makes the PSO-ZMP algorithm attractive for hardware implementation.

In the future, variants of PSO might be applied to strengthen the global searching ability and accelerate
convergence. In addition, a multiresolution procedure may be used to speed up the search and avoid being
trapped in local minima.

Figure 3: Simulation results on Akiyo, Container and Mother & Daughter

Figure 4: Simulation results on News and Silent, including (c) computations on Silent and (d) PSNR on Silent

Table 2: Average PSNR (dB) of DS, ARPS, GA-ZMP and PSO-ZMP

Sequence       DS      ARPS    GA-ZMP   PSO-ZMP
Akiyo          43.50   43.49   42.07    42.39
Container      36.34   36.13   32.36    33.15
Mot. & Dau.    40.46   40.57   35.66    39.48
News           36.66   36.61   35.02    35.29
Silent         36.68   36.46   31.62    32.64

Table 3: Computational gain of ARPS over DS and of PSO-ZMP over DS, ARPS and GA-ZMP

Sequence       ARPS to DS   PSO-ZMP to DS   PSO-ZMP to ARPS   PSO-ZMP to GA-ZMP
Akiyo          2.44         12.04           4.94              1.47
Container      2.24         8.10            3.62              1.54
Mot. & Dau.    2.22         12.44           5.62              1.61
News           2.32         9.85            4.25              1.68
Silent         2.25         8.60            3.82              2.17